{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 第10章 隐马尔可夫模型"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1．隐马尔可夫模型是关于时序的概率模型，描述由一个隐藏的马尔可夫链随机生成不可观测的状态的序列，再由各个状态随机生成一个观测而产生观测的序列的过程。\n",
    "\n",
    "隐马尔可夫模型由初始状态概率向$\\pi$、状态转移概率矩阵$A$和观测概率矩阵$B$决定。因此，隐马尔可夫模型可以写成$\\lambda=(A, B, \\pi)$。\n",
    "\n",
    "隐马尔可夫模型是一个生成模型，表示状态序列和观测序列的联合分布，但是状态序列是隐藏的，不可观测的。\n",
    "\n",
    "隐马尔可夫模型可以用于标注，这时状态对应着标记。标注问题是给定观测序列预测其对应的标记序列。\n",
    "\n",
    "2．概率计算问题。给定模型$\\lambda=(A, B, \\pi)$和观测序列$O＝(o_1，o_2,…,o_T)$，计算在模型$\\lambda$下观测序列$O$出现的概率$P(O|\\lambda)$。前向-后向算法是通过递推地计算前向-后向概率可以高效地进行隐马尔可夫模型的概率计算。\n",
    " \n",
    "3．学习问题。已知观测序列$O＝(o_1，o_2,…,o_T)$，估计模型$\\lambda=(A, B, \\pi)$参数，使得在该模型下观测序列概率$P(O|\\lambda)$最大。即用极大似然估计的方法估计参数。Baum-Welch算法，也就是EM算法可以高效地对隐马尔可夫模型进行训练。它是一种非监督学习算法。\n",
    "\n",
    "4．预测问题。已知模型$\\lambda=(A, B, \\pi)$和观测序列$O＝(o_1，o_2,…,o_T)$，求对给定观测序列条件概率$P(I|O)$最大的状态序列$I＝(i_1，i_2,…,i_T)$。维特比算法应用动态规划高效地求解最优路径，即概率最大的状态序列。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "class HiddenMarkov:\n",
    "    def __init__(self):\n",
    "        self.alphas = None\n",
    "        self.forward_P = None\n",
    "        self.betas = None\n",
    "        self.backward_P = None\n",
    "\n",
    "    # 前向算法\n",
    "    def forward(self, Q, V, A, B, O, PI):\n",
    "        # 状态序列的大小\n",
    "        N = len(Q)\n",
    "        # 观测序列的大小\n",
    "        M = len(O)\n",
    "        # 初始化前向概率alpha值\n",
    "        alphas = np.zeros((N, M))\n",
    "        # 时刻数=观测序列数\n",
    "        T = M\n",
    "        # 遍历每一个时刻，计算前向概率alpha值\n",
    "        for t in range(T):\n",
    "            # 得到序列对应的索引\n",
    "            indexOfO = V.index(O[t])\n",
    "            # 遍历状态序列\n",
    "            for i in range(N):\n",
    "                # 初始化alpha初值\n",
    "                if t == 0:\n",
    "                    # P176 公式(10.15)\n",
    "                    alphas[i][t] = PI[t][i] * B[i][indexOfO]\n",
    "                    print('alpha1(%d) = p%db%db(o1) = %f' %\n",
    "                          (i + 1, i, i, alphas[i][t]))\n",
    "                else:\n",
    "                    # P176 公式(10.16)\n",
    "                    alphas[i][t] = np.dot([alpha[t - 1] for alpha in alphas],\n",
    "                                          [a[i] for a in A]) * B[i][indexOfO]\n",
    "                    print('alpha%d(%d) = [sigma alpha%d(i)ai%d]b%d(o%d) = %f' %\n",
    "                          (t + 1, i + 1, t - 1, i, i, t, alphas[i][t]))\n",
    "        # P176 公式(10.17)\n",
    "        self.forward_P = np.sum([alpha[M - 1] for alpha in alphas])\n",
    "        self.alphas = alphas\n",
    "\n",
    "    # 后向算法\n",
    "    def backward(self, Q, V, A, B, O, PI):\n",
    "        # 状态序列的大小\n",
    "        N = len(Q)\n",
    "        # 观测序列的大小\n",
    "        M = len(O)\n",
    "        # 初始化后向概率beta值，P178 公式(10.19)\n",
    "        betas = np.ones((N, M))\n",
    "        #\n",
    "        for i in range(N):\n",
    "            print('beta%d(%d) = 1' % (M, i + 1))\n",
    "        # 对观测序列逆向遍历\n",
    "        for t in range(M - 2, -1, -1):\n",
    "            # 得到序列对应的索引\n",
    "            indexOfO = V.index(O[t + 1])\n",
    "            # 遍历状态序列\n",
    "            for i in range(N):\n",
    "                # P178 公式(10.20)\n",
    "                betas[i][t] = np.dot(\n",
    "                    np.multiply(A[i], [b[indexOfO] for b in B]),\n",
    "                    [beta[t + 1] for beta in betas])\n",
    "                realT = t + 1\n",
    "                realI = i + 1\n",
    "                print('beta%d(%d) = sigma[a%djbj(o%d)beta%d(j)] = (' %\n",
    "                      (realT, realI, realI, realT + 1, realT + 1),\n",
    "                      end='')\n",
    "                for j in range(N):\n",
    "                    print(\"%.2f * %.2f * %.2f + \" %\n",
    "                          (A[i][j], B[j][indexOfO], betas[j][t + 1]),\n",
    "                          end='')\n",
    "                print(\"0) = %.3f\" % betas[i][t])\n",
    "        # 取出第一个值\n",
    "        indexOfO = V.index(O[0])\n",
    "        self.betas = betas\n",
    "        # P178 公式(10.21)\n",
    "        P = np.dot(np.multiply(PI, [b[indexOfO] for b in B]),\n",
    "                   [beta[0] for beta in betas])\n",
    "        self.backward_P = P\n",
    "        print(\"P(O|lambda) = \", end=\"\")\n",
    "        for i in range(N):\n",
    "            print(\"%.1f * %.1f * %.5f + \" %\n",
    "                  (PI[0][i], B[i][indexOfO], betas[i][0]),\n",
    "                  end=\"\")\n",
    "        print(\"0 = %f\" % P)\n",
    "\n",
    "    # 维特比算法\n",
    "    def viterbi(self, Q, V, A, B, O, PI):\n",
    "        # 状态序列的大小\n",
    "        N = len(Q)\n",
    "        # 观测序列的大小\n",
    "        M = len(O)\n",
    "        # 初始化daltas\n",
    "        deltas = np.zeros((N, M))\n",
    "        # 初始化psis\n",
    "        psis = np.zeros((N, M))\n",
    "        # 初始化最优路径矩阵，该矩阵维度与观测序列维度相同\n",
    "        I = np.zeros((1, M))\n",
    "        # 遍历观测序列\n",
    "        for t in range(M):\n",
    "            # 递推从t=2开始\n",
    "            realT = t + 1\n",
    "            # 得到序列对应的索引\n",
    "            indexOfO = V.index(O[t])\n",
    "            for i in range(N):\n",
    "                realI = i + 1\n",
    "                if t == 0:\n",
    "                    # P185 算法10.5 步骤(1)\n",
    "                    deltas[i][t] = PI[0][i] * B[i][indexOfO]\n",
    "                    psis[i][t] = 0\n",
    "                    print('delta1(%d) = pi%d * b%d(o1) = %.2f * %.2f = %.2f' %\n",
    "                          (realI, realI, realI, PI[0][i], B[i][indexOfO],\n",
    "                           deltas[i][t]))\n",
    "                    print('psis1(%d) = 0' % (realI))\n",
    "                else:\n",
    "                    # # P185 算法10.5 步骤(2)\n",
    "                    deltas[i][t] = np.max(\n",
    "                        np.multiply([delta[t - 1] for delta in deltas],\n",
    "                                    [a[i] for a in A])) * B[i][indexOfO]\n",
    "                    print(\n",
    "                        'delta%d(%d) = max[delta%d(j)aj%d]b%d(o%d) = %.2f * %.2f = %.5f'\n",
    "                        % (realT, realI, realT - 1, realI, realI, realT,\n",
    "                           np.max(\n",
    "                               np.multiply([delta[t - 1] for delta in deltas],\n",
    "                                           [a[i] for a in A])), B[i][indexOfO],\n",
    "                           deltas[i][t]))\n",
    "                    psis[i][t] = np.argmax(\n",
    "                        np.multiply([delta[t - 1] for delta in deltas],\n",
    "                                    [a[i] for a in A]))\n",
    "                    print('psis%d(%d) = argmax[delta%d(j)aj%d] = %d' %\n",
    "                          (realT, realI, realT - 1, realI, psis[i][t]))\n",
    "        #print(deltas)\n",
    "        #print(psis)\n",
    "        # 得到最优路径的终结点\n",
    "        I[0][M - 1] = np.argmax([delta[M - 1] for delta in deltas])\n",
    "        print('i%d = argmax[deltaT(i)] = %d' % (M, I[0][M - 1] + 1))\n",
    "        # 递归由后向前得到其他结点\n",
    "        for t in range(M - 2, -1, -1):\n",
    "            I[0][t] = psis[int(I[0][t + 1])][t + 1]\n",
    "            print('i%d = psis%d(i%d) = %d' %\n",
    "                  (t + 1, t + 2, t + 2, I[0][t] + 1))\n",
    "        # 输出最优路径\n",
    "        print('最优路径是：', \"->\".join([str(int(i + 1)) for i in I[0]]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 习题10.1\n",
    "&emsp;&emsp;给定盒子和球组成的隐马尔可夫模型$\\lambda=(A,B,\\pi)$，其中，$$A=\\left[\\begin{array}{ccc}0.5&0.2&0.3\\\\0.3&0.5&0.2\\\\0.2&0.3&0.5\\end{array}\\right], \\quad B=\\left[\\begin{array}{cc}0.5&0.5\\\\0.4&0.6\\\\0.7&0.3\\end{array}\\right], \\quad \\pi=(0.2,0.4,0.4)^T$$设$T=4,O=(红,白,红,白)$，试用后向算法计算$P(O|\\lambda)$。\n",
    "\n",
    "**解答：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "#习题10.1\n",
    "Q = [1, 2, 3]\n",
    "V = ['红', '白']\n",
    "A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]\n",
    "B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]\n",
    "# O = ['红', '白', '红', '红', '白', '红', '白', '白']\n",
    "O = ['红', '白', '红', '白']    #习题10.1的例子\n",
    "PI = [[0.2, 0.4, 0.4]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "delta1(1) = pi1 * b1(o1) = 0.20 * 0.50 = 0.10\n",
      "psis1(1) = 0\n",
      "delta1(2) = pi2 * b2(o1) = 0.40 * 0.40 = 0.16\n",
      "psis1(2) = 0\n",
      "delta1(3) = pi3 * b3(o1) = 0.40 * 0.70 = 0.28\n",
      "psis1(3) = 0\n",
      "delta2(1) = max[delta1(j)aj1]b1(o2) = 0.06 * 0.50 = 0.02800\n",
      "psis2(1) = argmax[delta1(j)aj1] = 2\n",
      "delta2(2) = max[delta1(j)aj2]b2(o2) = 0.08 * 0.60 = 0.05040\n",
      "psis2(2) = argmax[delta1(j)aj2] = 2\n",
      "delta2(3) = max[delta1(j)aj3]b3(o2) = 0.14 * 0.30 = 0.04200\n",
      "psis2(3) = argmax[delta1(j)aj3] = 2\n",
      "delta3(1) = max[delta2(j)aj1]b1(o3) = 0.02 * 0.50 = 0.00756\n",
      "psis3(1) = argmax[delta2(j)aj1] = 1\n",
      "delta3(2) = max[delta2(j)aj2]b2(o3) = 0.03 * 0.40 = 0.01008\n",
      "psis3(2) = argmax[delta2(j)aj2] = 1\n",
      "delta3(3) = max[delta2(j)aj3]b3(o3) = 0.02 * 0.70 = 0.01470\n",
      "psis3(3) = argmax[delta2(j)aj3] = 2\n",
      "delta4(1) = max[delta3(j)aj1]b1(o4) = 0.00 * 0.50 = 0.00189\n",
      "psis4(1) = argmax[delta3(j)aj1] = 0\n",
      "delta4(2) = max[delta3(j)aj2]b2(o4) = 0.01 * 0.60 = 0.00302\n",
      "psis4(2) = argmax[delta3(j)aj2] = 1\n",
      "delta4(3) = max[delta3(j)aj3]b3(o4) = 0.01 * 0.30 = 0.00220\n",
      "psis4(3) = argmax[delta3(j)aj3] = 2\n",
      "i4 = argmax[deltaT(i)] = 2\n",
      "i3 = psis4(i4) = 2\n",
      "i2 = psis3(i3) = 2\n",
      "i1 = psis2(i2) = 3\n",
      "最优路径是： 3->2->2->2\n"
     ]
    }
   ],
   "source": [
    "HMM = HiddenMarkov()\n",
    "# HMM.forward(Q, V, A, B, O, PI)\n",
    "# HMM.backward(Q, V, A, B, O, PI)\n",
    "HMM.viterbi(Q, V, A, B, O, PI)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 习题10.2\n",
    "\n",
    "&emsp;&emsp;给定盒子和球组成的隐马尔可夫模型$\\lambda=(A,B,\\pi)$，其中，$$A=\\left[\\begin{array}{ccc}0.5&0.1&0.4\\\\0.3&0.5&0.2\\\\0.2&0.2&0.6\\end{array}\\right], \\quad B=\\left[\\begin{array}{cc}0.5&0.5\\\\0.4&0.6\\\\0.7&0.3\\end{array}\\right], \\quad \\pi=(0.2,0.3,0.5)^T$$设$T=8,O=(红,白,红,红,白,红,白,白)$，试用前向后向概率计算$P(i_4=q_3|O,\\lambda)$\n",
    "\n",
    "**解答：**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "Q = [1, 2, 3]\n",
    "V = ['红', '白']\n",
    "A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]\n",
    "B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]\n",
    "O = ['红', '白', '红', '红', '白', '红', '白', '白']\n",
    "PI = [[0.2, 0.3, 0.5]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "alpha1(1) = p0b0b(o1) = 0.100000\n",
      "alpha1(2) = p1b1b(o1) = 0.120000\n",
      "alpha1(3) = p2b2b(o1) = 0.350000\n",
      "alpha2(1) = [sigma alpha0(i)ai0]b0(o1) = 0.078000\n",
      "alpha2(2) = [sigma alpha0(i)ai1]b1(o1) = 0.111000\n",
      "alpha2(3) = [sigma alpha0(i)ai2]b2(o1) = 0.068700\n",
      "alpha3(1) = [sigma alpha1(i)ai0]b0(o2) = 0.043020\n",
      "alpha3(2) = [sigma alpha1(i)ai1]b1(o2) = 0.036684\n",
      "alpha3(3) = [sigma alpha1(i)ai2]b2(o2) = 0.055965\n",
      "alpha4(1) = [sigma alpha2(i)ai0]b0(o3) = 0.021854\n",
      "alpha4(2) = [sigma alpha2(i)ai1]b1(o3) = 0.017494\n",
      "alpha4(3) = [sigma alpha2(i)ai2]b2(o3) = 0.033758\n",
      "alpha5(1) = [sigma alpha3(i)ai0]b0(o4) = 0.011463\n",
      "alpha5(2) = [sigma alpha3(i)ai1]b1(o4) = 0.013947\n",
      "alpha5(3) = [sigma alpha3(i)ai2]b2(o4) = 0.008080\n",
      "alpha6(1) = [sigma alpha4(i)ai0]b0(o5) = 0.005766\n",
      "alpha6(2) = [sigma alpha4(i)ai1]b1(o5) = 0.004676\n",
      "alpha6(3) = [sigma alpha4(i)ai2]b2(o5) = 0.007188\n",
      "alpha7(1) = [sigma alpha5(i)ai0]b0(o6) = 0.002862\n",
      "alpha7(2) = [sigma alpha5(i)ai1]b1(o6) = 0.003389\n",
      "alpha7(3) = [sigma alpha5(i)ai2]b2(o6) = 0.001878\n",
      "alpha8(1) = [sigma alpha6(i)ai0]b0(o7) = 0.001411\n",
      "alpha8(2) = [sigma alpha6(i)ai1]b1(o7) = 0.001698\n",
      "alpha8(3) = [sigma alpha6(i)ai2]b2(o7) = 0.000743\n",
      "beta8(1) = 1\n",
      "beta8(2) = 1\n",
      "beta8(3) = 1\n",
      "beta7(1) = sigma[a1jbj(o8)beta8(j)] = (0.50 * 0.50 * 1.00 + 0.20 * 0.60 * 1.00 + 0.30 * 0.30 * 1.00 + 0) = 0.460\n",
      "beta7(2) = sigma[a2jbj(o8)beta8(j)] = (0.30 * 0.50 * 1.00 + 0.50 * 0.60 * 1.00 + 0.20 * 0.30 * 1.00 + 0) = 0.510\n",
      "beta7(3) = sigma[a3jbj(o8)beta8(j)] = (0.20 * 0.50 * 1.00 + 0.30 * 0.60 * 1.00 + 0.50 * 0.30 * 1.00 + 0) = 0.430\n",
      "beta6(1) = sigma[a1jbj(o7)beta7(j)] = (0.50 * 0.50 * 0.46 + 0.20 * 0.60 * 0.51 + 0.30 * 0.30 * 0.43 + 0) = 0.215\n",
      "beta6(2) = sigma[a2jbj(o7)beta7(j)] = (0.30 * 0.50 * 0.46 + 0.50 * 0.60 * 0.51 + 0.20 * 0.30 * 0.43 + 0) = 0.248\n",
      "beta6(3) = sigma[a3jbj(o7)beta7(j)] = (0.20 * 0.50 * 0.46 + 0.30 * 0.60 * 0.51 + 0.50 * 0.30 * 0.43 + 0) = 0.202\n",
      "beta5(1) = sigma[a1jbj(o6)beta6(j)] = (0.50 * 0.50 * 0.21 + 0.20 * 0.40 * 0.25 + 0.30 * 0.70 * 0.20 + 0) = 0.116\n",
      "beta5(2) = sigma[a2jbj(o6)beta6(j)] = (0.30 * 0.50 * 0.21 + 0.50 * 0.40 * 0.25 + 0.20 * 0.70 * 0.20 + 0) = 0.110\n",
      "beta5(3) = sigma[a3jbj(o6)beta6(j)] = (0.20 * 0.50 * 0.21 + 0.30 * 0.40 * 0.25 + 0.50 * 0.70 * 0.20 + 0) = 0.122\n",
      "beta4(1) = sigma[a1jbj(o5)beta5(j)] = (0.50 * 0.50 * 0.12 + 0.20 * 0.60 * 0.11 + 0.30 * 0.30 * 0.12 + 0) = 0.053\n",
      "beta4(2) = sigma[a2jbj(o5)beta5(j)] = (0.30 * 0.50 * 0.12 + 0.50 * 0.60 * 0.11 + 0.20 * 0.30 * 0.12 + 0) = 0.058\n",
      "beta4(3) = sigma[a3jbj(o5)beta5(j)] = (0.20 * 0.50 * 0.12 + 0.30 * 0.60 * 0.11 + 0.50 * 0.30 * 0.12 + 0) = 0.050\n",
      "beta3(1) = sigma[a1jbj(o4)beta4(j)] = (0.50 * 0.50 * 0.05 + 0.20 * 0.40 * 0.06 + 0.30 * 0.70 * 0.05 + 0) = 0.028\n",
      "beta3(2) = sigma[a2jbj(o4)beta4(j)] = (0.30 * 0.50 * 0.05 + 0.50 * 0.40 * 0.06 + 0.20 * 0.70 * 0.05 + 0) = 0.026\n",
      "beta3(3) = sigma[a3jbj(o4)beta4(j)] = (0.20 * 0.50 * 0.05 + 0.30 * 0.40 * 0.06 + 0.50 * 0.70 * 0.05 + 0) = 0.030\n",
      "beta2(1) = sigma[a1jbj(o3)beta3(j)] = (0.50 * 0.50 * 0.03 + 0.20 * 0.40 * 0.03 + 0.30 * 0.70 * 0.03 + 0) = 0.015\n",
      "beta2(2) = sigma[a2jbj(o3)beta3(j)] = (0.30 * 0.50 * 0.03 + 0.50 * 0.40 * 0.03 + 0.20 * 0.70 * 0.03 + 0) = 0.014\n",
      "beta2(3) = sigma[a3jbj(o3)beta3(j)] = (0.20 * 0.50 * 0.03 + 0.30 * 0.40 * 0.03 + 0.50 * 0.70 * 0.03 + 0) = 0.016\n",
      "beta1(1) = sigma[a1jbj(o2)beta2(j)] = (0.50 * 0.50 * 0.02 + 0.20 * 0.60 * 0.01 + 0.30 * 0.30 * 0.02 + 0) = 0.007\n",
      "beta1(2) = sigma[a2jbj(o2)beta2(j)] = (0.30 * 0.50 * 0.02 + 0.50 * 0.60 * 0.01 + 0.20 * 0.30 * 0.02 + 0) = 0.007\n",
      "beta1(3) = sigma[a3jbj(o2)beta2(j)] = (0.20 * 0.50 * 0.02 + 0.30 * 0.60 * 0.01 + 0.50 * 0.30 * 0.02 + 0) = 0.006\n",
      "P(O|lambda) = 0.2 * 0.5 * 0.00698 + 0.3 * 0.4 * 0.00741 + 0.5 * 0.7 * 0.00647 + 0 = 0.003852\n"
     ]
    }
   ],
   "source": [
    "HMM.forward(Q, V, A, B, O, PI)\n",
    "HMM.backward(Q, V, A, B, O, PI)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可知，$\\displaystyle P(i_4=q_3|O,\\lambda)=\\frac{P(i_4=q_3,O|\\lambda)}{P(O|\\lambda)}=\\frac{\\alpha_4(3)\\beta_4(3)}{P(O|\\lambda)}$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "alpha4(3)= 0.033757709999999996\n",
      "beta4(3)= 0.049728909999999994\n",
      "P(O|lambda)= 0.0038519735794910986\n",
      "P(i4=q3|O,lambda) = 0.4358114321796269\n"
     ]
    }
   ],
   "source": [
    "print(\"alpha4(3)=\", HMM.alphas[3 - 1][4 - 1])\n",
    "print(\"beta4(3)=\", HMM.betas[3 - 1][4 - 1])\n",
    "print(\"P(O|lambda)=\", HMM.backward_P[0])\n",
    "result = (HMM.alphas[3 - 1][4 - 1] *\n",
    "          HMM.betas[3 - 1][4 - 1]) / HMM.backward_P[0]\n",
    "print(\"P(i4=q3|O,lambda) =\", result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 习题10.3\n",
    "\n",
    "&emsp;&emsp;在习题10.1中，试用维特比算法求最优路径$I^*=(i_1^*,i_2^*,i_3^*,i_4^*)$。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "delta1(1) = pi1 * b1(o1) = 0.20 * 0.50 = 0.10\n",
      "psis1(1) = 0\n",
      "delta1(2) = pi2 * b2(o1) = 0.40 * 0.40 = 0.16\n",
      "psis1(2) = 0\n",
      "delta1(3) = pi3 * b3(o1) = 0.40 * 0.70 = 0.28\n",
      "psis1(3) = 0\n",
      "delta2(1) = max[delta1(j)aj1]b1(o2) = 0.06 * 0.50 = 0.02800\n",
      "psis2(1) = argmax[delta1(j)aj1] = 2\n",
      "delta2(2) = max[delta1(j)aj2]b2(o2) = 0.08 * 0.60 = 0.05040\n",
      "psis2(2) = argmax[delta1(j)aj2] = 2\n",
      "delta2(3) = max[delta1(j)aj3]b3(o2) = 0.14 * 0.30 = 0.04200\n",
      "psis2(3) = argmax[delta1(j)aj3] = 2\n",
      "delta3(1) = max[delta2(j)aj1]b1(o3) = 0.02 * 0.50 = 0.00756\n",
      "psis3(1) = argmax[delta2(j)aj1] = 1\n",
      "delta3(2) = max[delta2(j)aj2]b2(o3) = 0.03 * 0.40 = 0.01008\n",
      "psis3(2) = argmax[delta2(j)aj2] = 1\n",
      "delta3(3) = max[delta2(j)aj3]b3(o3) = 0.02 * 0.70 = 0.01470\n",
      "psis3(3) = argmax[delta2(j)aj3] = 2\n",
      "delta4(1) = max[delta3(j)aj1]b1(o4) = 0.00 * 0.50 = 0.00189\n",
      "psis4(1) = argmax[delta3(j)aj1] = 0\n",
      "delta4(2) = max[delta3(j)aj2]b2(o4) = 0.01 * 0.60 = 0.00302\n",
      "psis4(2) = argmax[delta3(j)aj2] = 1\n",
      "delta4(3) = max[delta3(j)aj3]b3(o4) = 0.01 * 0.30 = 0.00220\n",
      "psis4(3) = argmax[delta3(j)aj3] = 2\n",
      "i4 = argmax[deltaT(i)] = 2\n",
      "i3 = psis4(i4) = 2\n",
      "i2 = psis3(i3) = 2\n",
      "i1 = psis2(i2) = 3\n",
      "最优路径是： 3->2->2->2\n"
     ]
    }
   ],
   "source": [
    "Q = [1, 2, 3]\n",
    "V = ['红', '白']\n",
    "A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]\n",
    "B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]\n",
    "O = ['红', '白', '红', '白']\n",
    "PI = [[0.2, 0.4, 0.4]]\n",
    "\n",
    "HMM = HiddenMarkov()\n",
    "HMM.viterbi(Q, V, A, B, O, PI)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 习题10.4\n",
    "&emsp;&emsp;试用前向概率和后向概率推导$$P(O|\\lambda)=\\sum_{i=1}^N\\sum_{j=1}^N\\alpha_t(i)a_{ij}b_j(o_{t+1})\\beta_{t+1}(j),\\quad t=1,2,\\cdots,T-1$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**解答：**  \n",
    "$$\\begin{aligned}\n",
    "P(O|\\lambda)\n",
    "&= P(o_1,o_2,...,o_T|\\lambda) \\\\\n",
    "&= \\sum_{i=1}^N P(o_1,..,o_t,i_t=q_i|\\lambda) P(o_{t+1},..,o_T|i_t=q_i,\\lambda) \\\\\n",
    "&= \\sum_{i=1}^N \\sum_{j=1}^N P(o_1,..,o_t,i_t=q_i|\\lambda) P(o_{t+1},i_{t+1}=q_j|i_t=q_i,\\lambda)P(o_{t+2},..,o_T|i_{t+1}=q_j,\\lambda) \\\\\n",
    "&= \\sum_{i=1}^N \\sum_{j=1}^N [P(o_1,..,o_t,i_t=q_i|\\lambda) P(o_{t+1}|i_{t+1}=q_j,\\lambda) P(i_{t+1}=q_j|i_t=q_i,\\lambda) \\\\\n",
    "& \\quad \\quad \\quad \\quad P(o_{t+2},..,o_T|i_{t+1}=q_j,\\lambda)] \\\\\n",
    "&= \\sum_{i=1}^N \\sum_{j=1}^N \\alpha_t(i) a_{ij} b_j(o_{t+1}) \\beta_{t+1}(j),{\\quad}t=1,2,...,T-1\n",
    "\\end{aligned}$$\n",
    "命题得证。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 习题10.5\n",
    "\n",
    "&emsp;&emsp;比较维特比算法中变量$\\delta$的计算和前向算法中变量$\\alpha$的计算的主要区别。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**解答：**  \n",
    "**前向算法：**  \n",
    "1. 初值$$\\alpha_1(i)=\\pi_ib_i(o_i),i=1,2,\\cdots,N$$\n",
    "2. 递推，对$t=1,2,\\cdots,T-1$：$$\\alpha_{t+1}(i)=\\left[\\sum_{j=1}^N \\alpha_t(j) a_{ji} \\right]b_i(o_{t+1})，i=1,2,\\cdots,N$$\n",
    "\n",
    "**维特比算法：**  \n",
    "1. 初始化$$\\delta_1(i)=\\pi_ib_i(o_1),i=1,2,\\cdots,N$$\n",
    "2. 递推，对$t=2,3,\\cdots,T$$$\\delta_t(i)=\\max_{1 \\leqslant j \\leqslant N} [\\delta_{t-1}(j)a_{ji}]b_i(o_t), i=1,2,\\cdots,N$$  \n",
    "\n",
    "&emsp;&emsp;由上面算法可知，计算变量$\\alpha$的时候直接对上个的结果进行数值计算，而计算变量$\\delta$需要在上个结果计算的基础上选择最大值"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "参考代码：https://blog.csdn.net/tudaodiaozhale\n",
    "\n",
    "本文代码更新地址：https://github.com/fengdu78/lihang-code\n",
    "\n",
    "习题解答：https://github.com/datawhalechina/statistical-learning-method-solutions-manual\n",
    "\n",
    "中文注释制作：机器学习初学者公众号：ID:ai-start-com\n",
    "\n",
    "配置环境：python 3.5+\n",
    "\n",
    "代码全部测试通过。\n",
    "![gongzhong](../gongzhong.jpg)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
